Palhares et al. BMC Genetics 2012, 13:51 
http://www.biomedcentral.com/1471-2156/13/51



Genetics 



RESEARCH ARTICLE Open Access 



A novel linkage map of sugarcane with evidence 
for clustering of retrotransposon-based markers 

Alessandra C Palhares 1 , Taislene B Rodrigues-Morais 1 , Marie-Anne Van Sluys 2 , Douglas S Domingues 2,3 , 
Walter Maccheroni Jr 4,5 , Hamilton Jordao Jr 4,5 , Anete P Souza 6 , Thiago G Marconi 6 , Marcelo Mollinari 1 , 
Rodrigo Gazaffi 1 , Antonio Augusto F Garcia 1 and Maria Lucia Carneiro Vieira 1 * 



Abstract 

Background: The development of sugarcane as a sustainable crop has unlimited applications. The crop is one of 
the most economically viable for renewable energy production, and CO2 balance. Linkage maps are valuable tools
for understanding genetic and genomic organization, particularly in sugarcane due to its complex polyploid 
genome of multispecific origins. The overall objective of our study was to construct a novel sugarcane linkage map, 
compiling AFLP and EST-SSR markers, and to generate data on the distribution of markers anchored to sequences
of scIvana_1, a complete sugarcane transposable element, and member of the Copia superfamily.

Results: The mapping population parents ('IAC66-6' and 'TUC71-7') contributed equally to polymorphisms,
independent of marker type, and generated markers that were distributed into nearly the same number of co- 
segregation groups (or CGs). Bi-parentally inherited alleles provided the integration of 19 CGs. The marker number 
per CG ranged from two to 39. The total map length was 4,843.19 cM, with a marker density of 8.87 cM. Markers 
were assembled into 92 CGs that ranged in length from 1.14 to 404.72 cM, with an estimated average length of 
52.64 cM. The greatest distance between two adjacent markers was 48.25 cM. The scIvana_1-based markers (56)
were positioned on 21 CGs, but were not regularly distributed. Interestingly, the distance between adjacent
scIvana_1-based markers was less than 5 cM, and was observed on five CGs, suggesting a clustered organization.

Conclusions: Results indicated the use of a NBS-profiling technique was efficient to develop retrotransposon-based 
markers in sugarcane. The simultaneous maximum-likelihood estimates of linkage and linkage phase based 
strategies confirmed the suitability of its approach to estimate linkage, and construct the linkage map. Interestingly, 
using our genetic data it was possible to calculate the number of retrotransposon scIvana_1 (~60) copies in the
sugarcane genome, confirming previously reported molecular results. In addition, this research possibly will have 
indirect implications in crop economics e.g., productivity enhancement via QTL studies, as the mapping population 
parents differ in response to an important fungal disease. 

Keywords: Saccharum spp, AFLP, EST-SSR, Retrotransposon-based markers, Single-dose markers, Integrated genetic 
map, Marker distribution 



Background

Sugarcane is a crop of unquestionable importance for
tropical and subtropical regions of the world, where it
occupied 24 million hectares in 2009 [1]. Sugarcane is a
cost effective renewable resource, with alternative production
in food, feed, fiber, and energy e.g. sugar, animal
feeds, alcohols, and fertilizers. Brazil is one of the greatest
producers and exporters of sugar and ethanol from
sugarcane, where cane production reached approximately
625 million tons in 2010, and the sugarcane industry
generated a gross annual income of approximately US$
23 billion [2].

Sugarcane exhibits the most complex genome of any 
genetically bred crop. Selection based on scientific 
approaches began in 1888; the first hybridizations were 
conducted in Java and Barbados, between Saccharum 
officinarum and S. spontaneum. S. officinarum, known as
'noble' canes, are high in sucrose with juicy, thick stalks; 
low in fiber content, and susceptible to several diseases 
[3]. S. spontaneum is low in sucrose content, but robust, 
and resistant to abiotic stresses and pests [4]. One 
hundred years of interspecific hybrid backcrossings with 
S. officinarum (used as the maternal parent), a process 
called 'nobilization', led breeders to obtain more productive
varieties, with ratooning ability, and increased resistance
to biotic and abiotic stresses [5,6]. Subsequently,
all modern sugarcanes derive largely from intercrossing 
these canes, followed by intensive selection [7,8]. There- 
fore, currently grown cultivars are denoted Saccharum 
spp. Varieties are evaluated for rusticity, pest resistance, 
and high sugar yield prior to release, which requires 12 
to 15 years. 

The contemporary sugarcane cultivars have a large (10 
Gb) and complex genome structure that is highly poly- 
ploid, aneuploid (2n = 100 to 130), and have multispeci- 
fic origins [9] with a complete set of homo(eo)logous 
genes ranging from 8 to 10 alleles. Classic cytological 
works as well as fluorescent in situ hybridization determined
that S. officinarum is an octaploid species
(2n = 8x = 80) that experienced few aneuploid events,
and the ploidy level of S. spontaneum varies from 5 to 
16 (2n = 40 to 128) [10,11]. Genome in situ hybridization 
assays reveal that the 'R570' cultivar (2n=115) shares 
80% of its chromosomes with S. officinarum, 10% with S. 
spontaneum, and 10% are recombinant chromosomes 
[12]. These results clearly indicate that two chromosome 
sets coexist in the sugarcane genome. Vegetative propa- 
gation resulted in sugarcane clones with variable 
chromosome numbers cultivated in field plantations, 
and numerical and structural alterations continued to 
accumulate. 

Linkage maps are valuable tools to elucidate genetic 
and genomic organization, particularly in sugarcane due 
to its increased ploidy levels. However, high inbreeding 
depression caused by endogamy limits the production of 
experimental mapping populations, such as F2, BCs,
RILs, and DH lines. The S. spontaneum 'SES 208' 
(2n = 64) linkage map published by Al-Janabi et al. [13] 
was the first map constructed directly from a complex 
polyploid species, based on single-dose markers (or 
SDMs) proposed by Wu et al. [14], which considers the 
use of simplex (single allele copy from one parent) to 
obtain the genetic map. Al-Janabi et al. [13] used progeny
from a cross between 'SES 208' and a diploidized
haploid derived from anther cultured 'SES 208' and 'ADP
85-0068' to estimate linkage. This strategy allowed direct
meiotic analysis in 'SES 208' and the observation of gametic
segregation ratios. Results showed autopolyploid
chromosomal behavior in 'SES 208', and the proportion
of SDMs to higher dose markers (multiple alleles) fit the



assumption of auto-octaploidy, with the absence of re- 
pulsion phase linkages. Subsequently, da Silva et al. [15] 
integrated the map of Al-Janabi et al. [13] with the 
simplex-based map of Sobral et al. [16]. SDMs linkage 
relationships were determined using MapMaker software 
[17]. 

Later, Grivet et al. [18] used selfed progeny to estimate 
linkage for the elite cultivar 'R570'. To date, self- 
fertilized sugarcane progeny are used to map simplex 
and duplex markers on co-segregation groups (CGs); for 
instance, Andru et al. [19] using the software JoinMap 
3.0 [20] constructed a map for the cultivar Louisiana 
'LCP 85-384'. Alternatively, crossing unrelated heterozygous
genotypes generates a segregating sibling population
(F1), which can be valuable for constructing
outcrossing species maps, including mapping in sugarcane.
From the genetic configurations expected in a segregating
F1, and denoting gel band presence as A, and its
absence as O, we can make use of the designation AO x
OO (in a diploid species), or Simplex x Nuliplex (in
sugarcane) to construct individual maps, one for each
parent. This approach, known as the pseudo testcross
strategy [21], uses two sets of dominant markers that
segregate in a 1:1 ratio. It was applied to estimate linkage
in interspecific crosses, where S. officinarum was the
female ('Green German', 'IJ 76-514', 'La Striped'), and S.
spontaneum the male parent ('IND 81-146' and 'SES
147B') [22-26]. The Australian cultivars 'MQ77-340',
'Q165', and 'MQ76-53' were mapped similarly [27-29].
Each of these studies determined a comparable number 
of linkage groups (LGs) and map lengths; e.g. 'La 
Striped' (2n = 80): 49 LGs, 1,732 cM; 'SES 147B' 
(2n = 64): 45 LGs, 1,491 cM [26]. 

Garcia et al. [30] constructed a single integrated gen- 
etic map based on simultaneous maximum-likelihood 
linkage estimates and linkage phase methodology [31], 
based on a population derived from a cross between two 
pre-commercial sugarcane cultivars ('SP80-180' x 'SP80- 
4966'). A total of 1,118 single-dose markers were identi- 
fied; 39% were derived from a testcross configuration be- 
tween parents segregating in a 1:1 ratio, and 61% 
segregated in 3:1 ratio, representing heterozygous loci in 
both parentals with identical genotypes. The final map 
was comprised of 357 linked markers, including RFLPs, 
SSRs, and AFLPs assigned to 131 CGs, with a LOD 
score of 5.0, recombination fraction of 37.5 cM. Authors 
indicated the simultaneous maximum-likelihood esti- 
mates of linkage, and linkage phases were appropriate to 
generate an integrated genetic map of sugarcane [30]. 
Then, to enhance existing map resolution, and identify 
putative functional polymorphic gene loci, Oliveira et al. 
[32] screened EST-SSRs and EST-RFLPs in the same 
mapping population. Markers analyzed in the previous 
map were added to 2,303 newly generated polymorphic 
markers, including 1,669 (72.5%) SDMs; 664 (40%) were
scattered on 192 CGs, with a total estimated length of 
6,261 cM. 

The current development of expressed sequence-based 
markers such as EST-SSRs, genic SNPs, and TRAPs enriches
the genetic data that comprise linkage maps. In
sugarcane, due to the necessity of mapping a large num- 
ber of markers to guarantee a reasonable coverage of its 
genome with many chromosomes [33], a novel and po- 
tentially useful approach is to compile anonymous and 
putative functional markers. Earlier, the Brazilian Sugar- 
cane Expressed Sequence Tag (or SUCEST) project 
[34,35] generated 237,954 high-quality ESTs organized 
into 43,141 putative unique sugarcane transcripts re- 
ferred to as sugarcane assembled sequences. Based on 
SUCEST data, Rossi et al. [36] developed RFLPs using 
probes derived from NBS-LRR and LRR conserved 
domains, and S-T Kinase type resistance genes, and 
positioned them on the 'R570' map. Besides, Rossi et al. 
[37] conducted a transposable element (TE) search, revealing
a surprisingly high number of expressed TE homologues,
and found all major transposon families were
represented in sugarcane. Mutator and Hopscotch were 
later reported as the most represented TE families in the 
sugarcane transcriptome [38]. The SUCEST database 
was used to describe two LTR retrotransposon families, 
which were denoted as scIvana_1 and scAle_1 [39]. Both
were reported as complete elements, and different members
of the Copia superfamily. scIvana_1 shows low
copy numbers (40 to 50) and low diversity among copies,
and is expressed under specific conditions in low-differentiated
tissues; scAle_1 exhibits high copy numbers
in the sugarcane genome (> 1000), is more diversified
compared to scIvana_1, and is active under varied
physiological conditions.

Retrotransposon-based markers have been developed 
using several approaches. In Poaceae species, for ex- 
ample, SSAP was first used to study the distribution of 
BARE-1-like retrotransposable elements in the barley genome
[40]. In brief, SSAP uses two restriction enzymes to
generate a large number of DNA fragments; after that, a 
retrotransposon-anchored PCR is used to perform a se- 
lective amplification. In the 90's, two new techniques 
were developed to exploit the polymorphism generated 
by BARE-1 genome integration, named REMAP and 
IRAP [41]. Patterns indicate that although the BARE-1 
family of retrotransposons is dispersed, these elements 
are clustered or nested locally, and often found near 
microsatellite sequences. Later, both procedures were 
reported as useful to screen insertional polymorphisms 
in populations of Spartina anglica, an allopolyploid 
involved in natural and artificial invasions [42]. These 
methods are dominant and multiplex, and generate an- 
onymous marker bands. IRAP is based on the PCR 



amplification of genomic DNA fragments which lie be- 
tween two retrotransposon insertion sites, and REMAP 
is based on the amplification of fragments which lie be- 
tween a retrotransposon insertion site and a microsatel- 
lite site. Polymorphism is detected by the presence or 
absence of the PCR product in both techniques. Lack of 
amplification indicates the absence of the retrotrans- 
poson at the particular locus. In contrast, the RBIP [43] 
and ISBP [44] score individual loci and are used to 
search for insertional polymorphisms. RBIP, for example, 
was used to address the issue of evolution of rice var- 
ieties [45] and ISBP was used to analyze diversity in 
wheat [46]. The RBIP method exploits knowledge of the 
sequence flanking a TE to design the primers while the 
ISBP method uses one primer in the element and the 
other in the flanking DNA sequence. 

The overall objective of our study was to generate data
on scIvana_1- and scAle_1-based marker distribution in a
novel sugarcane linkage map based on a compilation of
AFLPs and EST-SSRs.

Results 

Genotyping and segregation analyses 

Excellent AFLP, EST-SSR, and scIvana_1-based marker
banding profiles were obtained [see Additional file
1], despite the size and complexity of the sugarcane genome.
The 72 enzyme-selective primer combinations
tested provided a range of AFLP band numbers per gel 
(44 to 174), and polymorphic loci (four to 33). Subse- 
quently, 22 combinations were selected as optimal for 
genotyping the F1 population [Additional file 2]. The
combinations generated 102 to 172 AFLP bands per gel, 
and 19 to 48 polymorphic loci, which revealed 22.1% 
(685/3,094) segregating loci. Among the segregating loci, 
71.2% (488/685) segregated in only one parent, and 
28.8% (197/685) segregated in both parents (Table 1). 
The 'IAC66-6' clone and 'TUC71-7' variety contributed a 
respective 52% (254/488) and 48% (234/488) of the loci 
that segregated in only one parent. The average number 
of amplified bands and segregating loci per enzyme- 
primer combination were 140.6 (3,094/22) and 31.1 
(685/22), respectively. 

From the 184 EST-SSR loci initially investigated, 22.3% 
(41) were selected for genotyping [Additional file 2]. 
These loci revealed 273 alleles with an average of 6.7 
alleles per locus; 80.6% (220) segregated in the F1 population,
68.6% (151/220) segregating in only one parent,
and 31.4% (69/220) in both parents (Table 1). The 
'IAC66-6' clone and 'TUC71-7' variety contributed a re- 
spective 43% (65/151) and 57% (86/151) of the alleles 
that segregated in only one parent. 

Among the 16 restriction enzyme-primer combinations 
used to amplify the retrotransposon scIvana_1-based markers,
six were selected for genotyping the F1 population.






Table 1 Marker polymorphisms used for mapping, and distribution of the different markers according to the cross
type (D1, D2 and C)

Marker type                                                    AFLP   EST-SSR   scIvana_1   Total
Number of scorable bands (evaluated in total) a               3,094       273         357   3,724
Number of segregating markers (genotyped)                       685       220          87     992
Number of polymorphic markers between parents                   488       151          74     713
Number of monomorphic markers between parents                   197        69          13     279
Number of single dose markers (SDMs) b                          535       130          65     730
SDMs of origin from 'IAC66-6' [D1] c                            197        41          23     261
SDMs of origin from 'TUC71-7' [D2] c                            192        60          32     284
SDMs of origin from both parents [C] d                          146        29          10     185
Total number of linked markers on the map                       395        95          56     546
Number of linked markers of origin from 'IAC66-6' [D1]          154        33          21     208
Number of linked markers of origin from 'TUC71-7' [D2]          164        46          30     240
Number of linked markers of origin from both parents [C]         77        16           5      98

a AFLPs generated from 22 enzyme-selective primer combinations, EST-SSR alleles generated from 41 loci and
scIvana_1-based markers generated from 6 restriction enzyme-primer combinations.
b Data obtained after Bonferroni's correction.
c Markers present in only one parent with a 1:1 segregation ratio in the mapping population.
d Markers present in both parents with a 3:1 segregation ratio in the mapping population.



These combinations revealed 357 loci; 24.4% (87) behaved 
as segregating loci. Among them, 85.1% (74/87) and 14.9% 
(13/87) segregated in only one parent and in both parents, 
respectively (Table 1). The average number of amplified 
bands and segregating loci per enzyme-primer combin- 
ation were 59.5 (357/6) and 14.5 (87/6), respectively [Add- 
itional file 2]. The male and female parents contributed a 
respective 44.6% (33/74) and 55.4% (41/74) of loci that 
segregated in only one parent. Sixteen retrotransposon 
scAle_1 combinations revealed gel profiles; however,
amplicon absence, or amplifications associated with non- 
specific polymorphism prevented profile use in 
genotyping. 

AFLP and scIvana_1-based loci exhibited similar levels
of segregating alleles in the mapping population (~25%).
Notably, both techniques reveal polymorphisms at restriction
enzyme cleavage sites. On the other hand, EST-SSR
loci showed high levels of segregating alleles in the
mapping population (~80%). The polymorphisms observed
at EST-SSR loci are due to differences in the size
of multiple alleles; we cannot predict if any allele will be 
fixed in a sugarcane cultivar, which is expected to be 
highly heterozygous. Both parents contributed equally to 
polymorphisms i.e. 49.4% ± 3.7% and 50.6% ± 3.8% of the 
amplicons derived from the male ('IAC66-6') and fe- 
male ('TUC71-7') parent, respectively, independent of 
marker type. 

It is important to clarify that we organized our segre- 
gation data assuming that homo(eo)logous chromo- 
somes paired faithfully during meiosis, leading to regular 
bivalent formation as well as normal gametes. It is im- 
perative to emphasize that sugarcane is an artificial 



genome, highly polyploid, aneuploid, and has interspeci- 
fic origins, which impedes our capacity to designate co- 
dominant markers at any locus. Consequently, loci were 
divided into heterozygous in one parent and null in the 
other (simplex x nuliplex), and heterozygous in both 
parents (simplex x simplex). Based on this model, all 
markers were scored as dominant (or binary), and were 
assigned to the expected segregation ratios i.e. 1:1 and 
3:1 (Table 1). 
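
The two expected ratios follow directly from enumerating gametes under the stated assumption of regular bivalent pairing. The short Python sketch below is illustrative only: the function name and the encoding of gametes as 'A' (band allele present) and 'O' (absent) are assumptions made for this example, not part of the original analysis.

```python
from fractions import Fraction
from itertools import product

# Gamete types produced with equal frequency by each parental configuration.
# 'A' = gamete carries the single-dose band allele, 'O' = it does not.
SIMPLEX = ("A", "O")   # one copy of the allele in the parent
NULLIPLEX = ("O",)     # allele absent from the parent


def band_presence_fraction(parent1_gametes, parent2_gametes):
    """Return the expected fraction of progeny showing the (dominant) band."""
    crosses = list(product(parent1_gametes, parent2_gametes))
    with_band = sum(1 for g1, g2 in crosses if "A" in (g1, g2))
    return Fraction(with_band, len(crosses))


# simplex x nulliplex: half of the progeny carry the band (1:1)
print(band_presence_fraction(SIMPLEX, NULLIPLEX))  # 1/2
# simplex x simplex: three quarters of the progeny carry the band (3:1)
print(band_presence_fraction(SIMPLEX, SIMPLEX))    # 3/4
```

Because the markers are scored as dominant, the AA and AO progeny classes of the simplex x simplex cross are indistinguishable on the gel, which is why that configuration collapses to 3:1 presence:absence.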

A total of 992 segregating loci were genotyped in
the mapping population; 685 AFLPs (generated from
22 enzyme-selective primer combinations), 220 EST-SSRs
(derived from 41 loci), and 87 scIvana_1-based
loci (obtained from six enzyme-primer combinations).
The expected segregation ratios at each locus (992)
were checked using Chi-square tests, adjusting for
multiple tests using Bonferroni correction. After this
step, 730 (73.6%) loci were retained to build the map:
535 AFLPs, 130 EST-SSRs, and 65 scIvana_1-based
loci. The global level of significance used to determine
the validity of the segregation ratios of 1:1 and 3:1 was
5.04e-05 (alpha = 0.05/992).
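
The filtering step can be sketched as follows. This is a minimal illustration, not the software used in the study: the function names and the progeny counts in the usage lines are hypothetical, and the p-value uses the closed form for a Chi-square distribution with one degree of freedom.

```python
import math

N_TESTS = 992                # number of loci tested, from the text
ALPHA = 0.05 / N_TESTS       # Bonferroni-adjusted threshold, about 5.04e-05


def chi_square_segregation(present, absent, ratio):
    """Chi-square goodness-of-fit test of band presence:absence counts.

    ratio is the expected presence:absence ratio, e.g. (1, 1) or (3, 1).
    Returns (statistic, p_value) for one degree of freedom.
    """
    n = present + absent
    expected_present = n * ratio[0] / (ratio[0] + ratio[1])
    expected_absent = n - expected_present
    statistic = ((present - expected_present) ** 2 / expected_present
                 + (absent - expected_absent) ** 2 / expected_absent)
    # For 1 d.f. the upper-tail probability equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


def fits_ratio(present, absent, ratio, alpha=ALPHA):
    """True when the locus does not deviate significantly from the ratio."""
    _, p_value = chi_square_segregation(present, absent, ratio)
    return p_value >= alpha


# Hypothetical counts for a population of 100 individuals (illustration only).
print(fits_ratio(52, 48, (1, 1)))   # close to 1:1, retained
print(fits_ratio(90, 10, (1, 1)))   # strongly distorted, discarded
```

With a threshold this stringent, only loci with severe segregation distortion are discarded, which is consistent with about three quarters of the genotyped loci being retained.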

The genetic map and marker distribution within 
co-segregation groups 

The marker number used to perform linkage analyses
was 730; 261 were derived from 'IAC66-6' (here indicated
as D1), 284 from 'TUC71-7' (or D2), and 185 were
present in both parents (or C). The final sugarcane map
comprised 546 markers assembled into 92 co-segregation
groups (CGs); 184 markers were not assigned
to any CG. Coincidentally, D1 (208) and D2 (240) markers
were distributed into nearly the same number of CGs
(51 and 55, respectively). By using the loci that segregated
in a 3:1 fashion (98) as bridges, we provided the
integration of 19 CGs (I-1, I-2, I-3, I-4, I-6, I-8, I-9, I-12,
II-1, II-3, III-1, IV-1, VII-1, U-1, U-2, U-3, U-8, U-13,
and U-33). The marker number per CG ranged from
two to 39 (Figure 1). The total map length was
4,843.19 cM, with a marker density of 8.87 cM. The CG
length covered a range from 1.14 to 404.72 cM, with an
estimated average of 52.64 cM. The greatest distance between
two adjacent markers was 48.25 cM (CG I-3).
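
The summary figures above are simple ratios of the reported totals, and can be reproduced as a check. The Python sketch below uses only numbers stated in the text; the function names are chosen for this illustration.

```python
TOTAL_MAP_LENGTH_CM = 4843.19   # total map length reported in the text
LINKED_MARKERS = 546            # markers placed on the map
MARKERS_ANALYSED = 730          # single-dose markers used for linkage analysis
N_COSEGREGATION_GROUPS = 92     # number of CGs


def marker_density(total_length_cm, n_markers):
    """Average map length per linked marker, in cM."""
    return total_length_cm / n_markers


def mean_group_length(total_length_cm, n_groups):
    """Average co-segregation group length, in cM."""
    return total_length_cm / n_groups


print(f"{marker_density(TOTAL_MAP_LENGTH_CM, LINKED_MARKERS):.2f} cM per marker")
print(f"{mean_group_length(TOTAL_MAP_LENGTH_CM, N_COSEGREGATION_GROUPS):.2f} cM per CG")
print(MARKERS_ANALYSED - LINKED_MARKERS, "markers unlinked")
```

Note that the density is expressed here as total length divided by the number of markers, which matches the reported 8.87 cM.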

Forty-one CGs (44.6%) were assembled into seven pu- 
tative independent homo(eo)logous groups (HGs), which 
were recognized based on 82 alleles generated by 24 
EST-SSR loci. The remaining loci (ESTB189, ESTB191, 
CV11, and CV86) did not contribute to CG assembly; 
only one allele was positioned on the respective HG. 
The CG number assembled into HGs ranged from two 
to 15, the largest (HGI) contained 210 markers, and the 
smallest (HGVII) six markers, which was clearly a very 
non-uniform distribution (Figure 1). Eight previously 
mapped EST-SSR loci [32] were placed here on HGI 
(ESTA53, ESTB94, ESTB100, and ESTB118), HGIII 
(ESTA48), HGIV (ESTC80 and ESTB99), and HGV 
(ESTB60). 

The mapped proportion of scIvana_1-based loci was
86.2% (56/65), greater than AFLPs (73.8%, 395/535) and
EST-SSRs (73.1%, 95/130). The scIvana_1-based loci
(56) were positioned on 21 CGs, which were represented
in four HGs. However, there is some evidence that these
markers may not be regularly distributed (Figure 1); only
one marker was placed in groups I-6, I-13, III-3, III-4,
U-1, U-2, U-4, U-7 and U-8, and more than five markers
were placed in groups I-1, I-3, and IV-1. The evidence
for clustering of scIvana_1-based markers was verified
by the Chi-square goodness-of-fit test, which resulted in
a p-value of 4.64e-04. This model has been used for testing
if marker distribution deviates significantly from a random
distribution in genetic maps i.e. segregating as
expected in a Poisson distribution [47]. The number of
regions with two (6) and three (1) adjacent markers was
higher than expected by chance alone (2 and 0.09, respectively).
Interestingly, distances between some of the adjacent
scIvana_1-based markers were lower than 5 cM,
notably in groups I-1, I-2, III-2, IV-1, and U-51
(Table 2). This value was the same used by Rossi et al.
[36] to define NBS/LRR RGA clusters.
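
The 5 cM criterion can be illustrated with a small Python sketch that groups markers on one co-segregation group into clusters whenever neighbouring markers are less than 5 cM apart. The function name and the map positions in the usage lines are hypothetical and are not taken from the map; the sketch also assumes that only retrotransposon-based marker positions are passed in.

```python
def find_clusters(positions_cm, max_gap_cm=5.0):
    """Group marker positions into clusters of adjacent markers.

    Two neighbouring markers belong to the same cluster when the distance
    between them is below max_gap_cm. Only clusters with at least two
    markers are returned, each as a sorted list of positions.
    """
    clusters = []
    current = []
    for position in sorted(positions_cm):
        if current and position - current[-1] >= max_gap_cm:
            # Gap too large: close the running group and start a new one.
            if len(current) >= 2:
                clusters.append(current)
            current = []
        current.append(position)
    if len(current) >= 2:
        clusters.append(current)
    return clusters


# Hypothetical positions (cM) of retrotransposon-based markers on one CG.
example_positions = [0.0, 1.74, 30.2, 61.0, 64.0, 66.64]
for cluster in find_clusters(example_positions):
    size = cluster[-1] - cluster[0]   # distance between flanking markers
    print(len(cluster), "markers spanning", round(size, 2), "cM")
```

Under this definition a cluster of three markers can span more than 5 cM between its flanking markers while every adjacent pair is still closer than 5 cM.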

We subsequently took the total scIvana_1-based
marker number (357) and divided it by the number of
enzyme-primer combinations used to obtain the amplicons
(6); the result was ~60. We propose this as the
number of retrotransposon scIvana_1 copies in the
sugarcane genome. Similarly, from the total number
of scIvana_1-based markers each parent contributed
separately, the following estimates correspond to the number
of scIvana_1 copies in the respective 'IAC66-6' and
'TUC71-7' genomes: 53 (316/6) and 54 (324/6).
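
The copy-number estimate is the mean number of bands per enzyme-primer combination. A minimal Python sketch of the arithmetic, using only the counts given in the text (the function name is an assumption made for this example):

```python
def estimate_copy_number(total_bands, n_combinations):
    """Mean number of element-anchored bands per enzyme-primer combination."""
    return total_bands / n_combinations


N_COMBINATIONS = 6  # enzyme-primer combinations used for genotyping

estimates = {
    "both parents": estimate_copy_number(357, N_COMBINATIONS),
    "'IAC66-6'": estimate_copy_number(316, N_COMBINATIONS),
    "'TUC71-7'": estimate_copy_number(324, N_COMBINATIONS),
}
for source, value in estimates.items():
    print(f"{source}: {value:.1f} bands per combination")
```

The reasoning relies on the enzymes having no recognition site inside the element, so that each copy is expected to contribute about one band per combination.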

However, the total number of scAle_1-based fragments
was 1008. Nine restriction enzyme-primer combinations
were used to amplify the scAle_1-based fragments, an
average of 112 per combination. The results did not fa- 
cilitate the identification of any clear polymorphisms in 
the gels. Consequently, the data were not used for 
mapping studies. 

Molecular validation of scIvana_1-based fragments

Twenty-five scIvana_1-based markers selected for
genotyping were sequenced; and 64% (16/25) showed 
homology with known nucleotide sequences deposited 
in GenBank (Table 3). Most (13 sequences) showed homology
with scIvana_1 sequences, and others showed
homology to Zea mays (DIsIvSI337), Oryza minuta (DIs- 
IvSI390, DIsIvSI412), and Sorghum bicolor (SIsIvSI180) 
sequences. The DIsIvLI228 fragment also exhibited 
homology with a Saccharum chloroplast sequence. Six 
of the sequences were mapped, DIsIvSI163, DIsIvSI208, 
DIsIvSI160, SIsIvLI240, SIsIvLI412, and DIsIvLI415, and 
two were tightly linked with EST-SSRs. The DIsIvSI337 
fragment was submitted for tblastx search, and revealed 
similarity with a hypothetical, highly conserved protein 
of unknown function in Arabidopsis thaliana, and other 
species. Similarly, DIsIvSI390 and DIsIvSI412 fragments 
were aligned with an Oryza minuta sequence; tblastx 
indicated similarity with a hypothetical protein of un- 
known function. The SIsIvSI180 fragment exhibited 
similarity to two Sorghum sequences, one to a retro- 
transposon of S. bicolor [48]. Finally, the DIsIvLI228 
fragment exhibited a certain identity with a ribosomal 
protein chloroplast sequence of Saccharum 'SP80-3280' 
(Table 3). 

Discussion 

AFLPs have been used to assess genetic diversity in 
germplasm collections of sugarcane and close relatives 
[49-52], and to build linkage maps [19,53]. In light of 
these studies, AFLPs have been informative in generating 
a substantial amount of unambiguous polymorphic mar- 
kers. For example, Andru et al. [19] reported 64 AFLP 
restriction enzyme-primer combinations, and detected 
816 polymorphic loci; in the present study, 22 combina- 
tions revealed 685 polymorphic loci. We suggest the dif- 
ferences were the result of the genotyping population, 
the first with increased homozygosity (S1 progeny) relative
to a segregating F1 population. In both studies,
AFLPs were a viable marker to build a scaffold for other 
marker types, including expressed sequence-based mar- 
kers, which could be positioned on the scaffold. Fur- 
thermore, this scaffold is particularly important in 
Figure 1 Linkage map of the population 'IAC66-6' x 'TUC71-7'. Genetic distances between adjacent markers are shown on the left of each
co-segregation group (CG). AFLPs constitute the map scaffold; EST-SSR loci appear in bold with asterisk symbols, and scIvana_1-based markers are
depicted in bold in a gray box. The final map was constructed with 546 markers associated with 92 CGs (in Arabic numerals). Forty-one CGs
(44.6%) were assembled into seven putative independent HGs (in Roman numerals). Other CGs (51) were denoted as unassigned groups (U).
Note the clustered tendency of some scIvana_1-based markers.



sugarcane, as expressed sequences are physically too far 
apart. 

From the 22 enzyme-selective primer combinations 
selected in this study to detect AFLPs, 14 were previ- 
ously tested to build maps for the 'R570' cultivar [53], 
the S. officinarum 'IJ 76-514' clone, and 'Q165' [28], 
'Q117' and 'MQ77-340' [27] cultivars as well as for the 
F1 population 'SP80-180' x 'SP80-4966' [30]. Based on
shared AFLPs and SSRs, the map comparisons of 'R570',
'Q165', 'Q117', and 'MQ77-340' cultivars were done [33];
several co-segregation groups (CGs) were aligned, and 
homo(eo)logous groups (HGs) associated, which 
received the same designation. Because several common
alleles were positioned, suitable data should be
available to construct a reference map for sugarcane 
commercial varieties, despite the pedigree complexity. 
For instance, a comparison between the map built based 
on the present study and the one published by Pastina 
et al. [54] is possible. Both are integrated linkage maps 
that share some common SSRs. In both maps, ESTA53, 
ESTB94, ESTB100 and ESTB118 were assigned to HGI,
and ESTA48 was mapped in the same grouping,
HGIII. In addition, HGV contains three CGs that share
the ESTB60 locus, which was assigned to HGVII of 
Pastina's map [54]. This suggests a possible corres- 
pondence between these HGs. 

Several authors have applied SSR markers to estimate 
linkage in sugarcane [19,30,32,33]. Due to the multiallelic 
nature and relative abundance of SSRs in plant genomes, 
they have utility to identify HGs in polyploid species [33]. 
Based on this principle, Rossi et al. [36] identified 66 
CGs (of 128) assembled into seven HGs from the French 
cultivar 'R570' linkage map. Similarly, Aitken et al. [28] 
identified 136 CGs assembled into eight HGs from the 
Australian cultivar 'Q165' linkage map. Oliveira et al. 
[32] identified 120 CGs (of 192) assembled into 14 HGs 
from a map of the progeny derived from a single cross 
between 'SP80-180' and 'SP80-4966'. 

Here, we compiled 41 CGs (of 92) into seven putative 
HGs. Interestingly, six EST-SSR loci (CV22, CV38, CV78,
CV100, ESTB14, and ESTB94) were duplicated
within chromosomes; these were positioned on HGI
and HGIV (Figure 1). Duplicated genomic regions were
reported to occur in various sugarcane maps 
[23,28,32,33], and are possibly a consequence of the 
multispecific origins of the modern cultivars. Structural 
genomic rearrangements, including the movement of 



transposable elements (TEs) may also explain the dupli- 
cations [55]. 

We used an innovative approach by mapping 
transposon-based markers in sugarcane using the NBS- 
profiling technique. In other plant species, TEs have been 
mapped using SSAP. This marker system was applied in 
barley (namely the TE BARE-1) [40], wheat (TEs Wis2A-1A
and BARE-1) [56], lettuce (Tls1 and Tls2) [57], and
tomato (ToRTL1, T265 and Tnt1) [58]. Due to the
advanced knowledge in tomato genetics, it was possible 
to determine that polymorphic insertions were primarily 
located in the centromeric regions. Both the above- 
mentioned approaches increase the available information 
regarding retrotransposon distribution over plant linkage 
groups. The NBS-profiling protocol efficiently targeted 
scIvana_1 retrotransposon sequences and, at the same
time, produced a polymorphic multilocus marker profile 
that was enriched for these sequences. Both the SSAP ap- 
proach and the NBS-profiling technique investigate poly- 
morphic restriction sites and the presence or absence of 
the retrotransposon sequence. SSAP uses two restriction 
enzymes (one with frequent cut sites and the other with 
rare cut sites), generating a large number of DNA frag- 
ments before performing the selective amplification. PCR 
results from the use of a primer that is complementary to 
the adaptor sequence and other complementary to the 
retrotransposon sequence. Since we have to choose two 
enzymes which have no recognition sites for restriction in 
the retrotransposon sequence, it reduces the number of 
combinations (enzyme/primer) to be tested. The NBS- 
profiling technique only uses one restriction enzyme (with 
frequent or rare cut sites), generating a small number of 
DNA fragments to be selected and consequently be 
stained and visualized separately in the gel. This is espe- 
cially important considering the large genome size of 
sugarcane. Using enzymes that have no recognition 
sequences in the retrotransposon scIvana_1 (combined
with primers complementary to its sequence), it was pos- 
sible to estimate the number of copies of this element in 
the genome. 

Among the 16 restriction enzyme-primer combinations
used to amplify scIvana_1, the enzymes DraI and
SspI resulted in an increased number of scorable bands
and polymorphic loci compared to AluI and RsaI. Earlier
studies have shown rare-cutting enzymes such as DraI
and SspI are more suitable for restricting sugarcane
DNA due to its large size. However, enzymes that

Table 2 Retrotransposon clustering and its position into the sugarcane genetic map

Cluster   Markers                                         Cluster size a (cM)   Map position b
                                                                                HG     CG
1         D1-RlslvSI530, D1-DlslvLI496, D1-DlslvLII322    6.64                  I      I-1
2         D2-DlslvLI385, D2-SlslvLI390                    1.74                  I      I-1
3         C-RlslvSI372, C-DlslvLII405                     0.61                  I      I-2
4         D2-SlslvSI210, D2-DlslvLI905                    0.7                   III    III-2
5         D2-DlslvLI494, D2-SlslvLI550, D2-DlslvLII310    5.66                  IV     IV-1
6         D2-SlslvLI263, D2-RlslvSI280                    1.14                  -      U-51

a Defined by the distance between the flanking markers.
b HG, homo(eo)logous groups; CG, co-segregation groups.



frequently cut sugarcane DNA have the potential to
generate an enormous number of fragments and,
consequently, to affect other protocol steps and
subsequent results. All enzyme-primer combinations
resulted in non-amplification for scAle_1. Primer design
was challenging due to sequence diversity among scAle_1
copies. When amplicons were obtained, all combinations
resulted in ~112 bands per gel, which prevented
polymorphism identification. Alternatively, the number of
bands per gel can be reduced by adding selective bases at
the 3'-end of PCR primers, as previously demonstrated in
barley [40]. Queen et al. [56] used SSAP to study the
elements Wis-2-1A and BARE-1 in wheat, and four selective
nucleotides were added in an attempt to reduce the number
of amplified fragments. Both elements are known to have
1,000 copies in the wheat genome, and good results were
obtained for genotyping and mapping when applying this
strategy. In addition, we should try to produce markers
derived from scAle_1 subfamilies, thereby obtaining an
estimate of the number of copies of each subfamily in the
sugarcane genome.

The mean number of amplicons obtained with the
restriction enzyme-primer combinations was very similar to
the number estimated by molecular methods; 40 to 50
copies of sclvana_1 were detected in the sugarcane
genome, and scAle_1 exceeded 1,000 copies [39]. These
results were congruent with the 56 sclvana_1-based loci



Table 3 sclvana_1-based fragments with homology to nucleotide sequences deposited in GenBank (e-value < e-5)

Marker code  Size (bp)  GenBank accession no.  Homology                                                              E-value(a)
DIsIvSI      138        DQ115032.1             Saccharum 'SP80-3280' clone SCCCCL6002A07 TntMike, partial sequence   2e-27
DIsIvSI      160        DQ115032.1             Saccharum 'SP80-3280' clone SCCCCL6002A07 TntMike, partial sequence   8e-27
DIsIvSI      208        DQ115032.1             Saccharum 'SP80-3280' clone SCCCCL6002A07 TntMike, partial sequence   5e-29
DIsIvSI      310        DQ115032.1             Saccharum 'SP80-3280' clone SCCCCL6002A07 TntMike, partial sequence   2e-28
DIsIvSI      337        EU969904.1             Zea mays clone 337091 mRNA sequence                                   6e-28
DIsIvSI      390        DQ115032.1             Saccharum 'SP80-3280' clone SCCCCL6002A07 TntMike, partial sequence   5e-178
                        AC216031.1             Oryza minuta clone OM Ba0016E09, complete sequence                    1e-70
DIsIvSI      412        DQ115032.1             Saccharum 'SP80-3280' clone SCCCCL6002A07 TntMike, partial sequence   2e-147
                        AC216031.1             Oryza minuta clone OM Ba0016E09, complete sequence                    4e-60
SIsIvSI      158        DQ115032.1             Saccharum 'SP80-3280' clone SCCCCL6002A07 TntMike, partial sequence   2e-28
SIsIvSI      180        AC169373.2             Sorghum bicolor clone SB_BBc0188M08, complete sequence                3e-36
                        FN431662.1             Sorghum bicolor BAC contig 24P17c, cultivar Btx623                    1e-34
SIsIvSI      245        DQ115032.1             Saccharum 'SP80-3280' clone SCCCCL6002A07 TntMike, partial sequence   2e-28
SIsIvSI      330        DQ115032.1             Saccharum 'SP80-3280' clone SCCCCL6002A07 TntMike, partial sequence   2e-28
RIsIvSI      195        DQ115032.1             Saccharum 'SP80-3280' clone SCCCCL6002A07 TntMike, partial sequence   5e-29
DIsIvLI      228        AE009947.2             Saccharum 'SP-80-3280' chloroplast, complete genome                   3e-90
DIsIvLI      272        DQ115032.1             Saccharum 'SP80-3280' clone SCCCCL6002A07 TntMike, partial sequence   9e-106
DIsIvLI      415        DQ115032.1             Saccharum 'SP80-3280' clone SCCCCL6002A07 TntMike, partial sequence   2e-93
DIsIvLI      605        DQ115032.1             Saccharum 'SP80-3280' clone SCCCCL6002A07 TntMike, partial sequence   2e-102

(a) All are standard nucleotide-nucleotide BLASTn scores.
positioned on the linkage map, exhibiting a preferentially
clustered distribution along 21 CGs. In large-genome
cereals, Bennetzen [59] reported retrotransposon
distribution as nested insertions in highly
heterochromatic transposon clusters. Later, authors
reported some clustering of BARE-1/Wis-2-1A-based markers
on the linkage map of wheat [56], and transposon cluster
interference with the recombination machinery acting in
adjacent euchromatic regions in maize [60]. When
additional genomic data are available for sugarcane, it
will be interesting to investigate whether genes adjacent
to retrotransposon clusters are less recombinogenic.
Dooner and He [60] suggested that the more condensed
chromatin state of retrotransposon clusters in maize might
interfere with the access of the recombination machinery
to adjacent euchromatic regions. Additionally, in a recent
review, Kalendar et al. [61] indicated that, at least in
cereals and citrus, retrotransposons are often locally
nested one into another and organized in extensive domains
that have been referred to as 'retrotransposon seas'
surrounding gene islands.

We are possibly facing an association among clustered
retrotransposon sequences, the inhibition of DNA
recombination (which would explain the small map distance
between adjacent retrotransposon-based markers), and the
element copy number in the plant genome. This could
explain sclvana_1 properties, such as a low copy number
(~60) with expression and mobility under strict control,
in contrast to the properties of scAle_1
retrotransposons. Therefore, mapping the scAle_1 element
is of great interest, as is determining the location of
these two elements in chromosome regions.

The segregation results presented here indicated,
independently for AFLPs, EST-SSRs, and sclvana_1-based
loci, an outcome consistent with former studies
[26,28,32,53,62], i.e. most markers (~70%) were SDMs.
Furthermore, a substantial number of markers remained
unassigned, and the number of markers per CG varied.

The genetic map constructed here ('IAC66-6' x
'TUC71-7') has 546 SDMs covering 4,843 cM that were
ordered in 92 CGs, with a marker density of 8.87 cM.
The genetic map recently published by Pastina et al.
('SP80-180' x 'SP80-4966') has 317 markers covering
2,468 cM that were ordered in 96 CGs, with a marker
density of 7.5 cM [54]. These are both integrated maps
constructed based on segregating F1 populations. Note
that the number of CGs is somewhat higher in Pastina's
map, but it is shorter and denser. Cultivar maps are
established using self-fertilized populations, and
therefore are not comparable to maps built on different
backgrounds. For instance, the 'LCP 85-384' map has 784
markers covering 5,617 cM that were assigned to 108 CGs,
with a marker density of 7.16 cM [19], and the 'Q165' map
has 910 markers covering 9,058 cM that were assigned to
116 CGs, with a marker density of 9.95 cM [28]. Note
that, in this case, the shorter and denser map is the one
with the lower number of CGs.
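
The marker densities quoted for these maps are the total map
length divided by the number of mapped markers. A minimal
Python sketch of that arithmetic follows; the function name
`marker_density` and the list layout are illustrative choices
rather than part of the original analysis, and the length of
4,843.19 cM for the present map is the value given in the
Abstract.

```python
# Map summaries quoted in the text: (map name, number of markers, length in cM).
MAPS = [
    ("IAC66-6 x TUC71-7", 546, 4843.19),
    ("LCP 85-384", 784, 5617.0),
    ("Q165", 910, 9058.0),
]


def marker_density(total_length_cm: float, n_markers: int) -> float:
    """Average map length per marker, in cM."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return total_length_cm / n_markers


def summarize(maps):
    """Return {map name: density rounded to two decimals}."""
    return {name: round(marker_density(length, n), 2) for name, n, length in maps}


if __name__ == "__main__":
    for name, density in summarize(MAPS).items():
        print(f"{name}: {density} cM per marker")
```

A lower value means a denser map, which is how the comparison
between maps is read in the paragraph above.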

Enhancement of sugarcane genetic maps should include
additional segregation ratios in mapping analyses, and an
increased number of informative SNP and SSR loci
segregating in larger populations. In addition, meiotic
studies are needed as an important component of future
work on deciphering the genetic configuration of
sugarcane genotypes.

Finally, it is important to note that the parents of the
mapping population differ in their response to
Sporisorium scitamineum infection; therefore, we expect
the offspring to segregate for this trait. Consequently,
the genetic map established here can be used to localize
quantitative trait loci. It should contribute to a better
view of the genetic architecture of smut resistance in
sugarcane, as little is known on this subject [63,64]. As
recently shown by Pastina et al. [54], integrated genetic
maps are useful for mapping QTLs. Based on interval
mapping and mixed models, the authors mapped QTL effects
in a segregating progeny from a cross between two
pre-commercial cultivars. It would be interesting to apply
the same approach using the present map, which includes
retrotransposon-based markers. Moreover, we should
improve McNeil et al.'s strategy [65] by aligning marker
sequences tightly linked to QTLs for smut resistance with
data from the sugarcane genome sequencing project
currently underway [66].

Conclusions 

The results of this study showed that AFLPs are a viable
marker type to create a scaffold for a linkage map, on
which other marker types, including expressed
sequence-based markers, can be positioned. The
NBS-profiling technique was efficient for developing
retrotransposon-based markers in sugarcane. The strategy
based on simultaneous maximum-likelihood estimation of
linkage and linkage phases proved suitable for estimating
linkage and constructing the linkage map. Interestingly,
using our genetic data it was possible to estimate the
number of copies of the retrotransposon sclvana_1 (~60)
in the sugarcane genome, confirming previously reported
molecular results. In addition, this research may have
indirect implications for crop economics, e.g.,
productivity enhancement via QTL studies, as the mapping
population parents differ in response to an important
fungal disease.

Methods 

Plant material and genomic DNA extraction 

The mapping population was composed of 188 individuals
derived from a single cross between 'IAC66-6' and
'TUC71-7'. The male parent 'IAC66-6' is a clone with
low sucrose content and large-diameter stems, and is
susceptible to sugarcane smut, a fungal disease caused by
Sporisorium scitamineum; the female parent, the
Argentinean variety 'TUC71-7', exhibits a higher sucrose
content, smaller-diameter stems, and resistance to smut
disease. This disease limits the use of recent
high-yielding sugarcane varieties developed in Brazil. The
cross was made under field conditions at the
CanaVialis/Monsanto Company experimental station, located
in the northeastern state of Alagoas, Brazil
(S 09°39'57"; W 35°44'07"). Sugarcane successfully flowers
and sets seed at this locality due to the light period
duration (photoperiod), and plants were therefore
cultivated at this site. Seeds were harvested, germinated
in plastic boxes, and transported to the southeastern
state of Sao Paulo (S 22°19'49"; W 47°10'21") for field
cultivation.

DNA was isolated from young leaves of F1 progeny
and parental plants using the CTAB-based extraction
procedure [67], with minor modifications. DNA
concentrations were carefully estimated following
electrophoresis on ethidium bromide-stained agarose gels
using molecular weight standards; aliquots of 50 ng/µl
were prepared following quantification.

Generation of AFLP profiles 

AFLPs were amplified based on the protocol described
by Vos et al. [68] and applied to sugarcane. Briefly,
250 ng of genomic DNA was double digested with 6 U of
EcoRI (Promega) and MseI (NE Biolabs) in a 25-µl reaction
mixture (10 mM Tris-acetate pH 7.5, 10 mM Mg-acetate,
50 mM K-acetate, 5 mM DTT, 1X BSA) for 4 h at 37°C.
Restrictions were terminated by heat inactivation for
20 min at 65°C. The resulting fragments were ligated to
adapter sequences by addition of an equal volume of
ligation solution comprised of 0.25 µM EcoRI adapter,
2.5 µM MseI adapter, 1X enzyme reaction buffer, and 67 U
of T4 DNA ligase (400 units/µl, NE Biolabs). Incubations
were performed for 16 h at 16°C, and reactions were
terminated by heat inactivation. The adapter-ligated DNA
(3 µl) was used for pre-selective amplification with
primers based on the adapter sequences with one selective
nucleotide at the 3' end (EcoRI + A and MseI + C).
Pre-selective amplification was performed in a 20-µl
reaction mixture containing 1.5 mM MgCl2, 0.5 mM each
dNTP, 250 nM each primer, 1X enzyme reaction buffer, and
3 U Taq DNA polymerase (Promega). Amplifications were
conducted under the following conditions: 94°C for 2 min;
26 cycles of 94°C for 60 s, 56°C for 60 s, 72°C for 60 s;
and a final elongation at 72°C for 5 min. For the
selective step, 1.5 µl of a 5-fold water-diluted
pre-selective PCR product was used as DNA template. The
20-µl reaction mixture contained 1.5 mM MgCl2, 0.2 mM
each dNTP, 250 nM EcoRI + ANN primer, 300 nM MseI + CNN
primer, 1.6 U Taq DNA polymerase (Fermentas), and 1X
enzyme reaction buffer. Selective amplification was
conducted under the following conditions: 94°C for 2 min;
12 cycles of 94°C for 30 s, 65°C for 30 s, 72°C for 60 s;
23 further cycles under the same conditions except for a
56°C primer annealing temperature; and a final elongation
at 72°C for 2 min. Following PCR, the amplified products
were mixed with an equal volume of denaturing buffer
containing 95% formamide, 10 mM EDTA (pH 8.0), 0.2%
bromophenol blue, and 0.2% xylene cyanol. Samples (3 µl)
were loaded into 5% (w/v) polyacrylamide gels
(acrylamide/bis-acrylamide, 19:1). Electrophoresis was
performed at a constant power of 70 W for 4 h, using a
Sequi-Gen® GT (Bio-Rad) apparatus. Gels were
silver-stained according to the protocol described by
Creste et al. [69].

Seventy-two different restriction enzyme and selective
primer combinations were examined using DNA from both
parents in duplicate reactions. Combinations that
exhibited good profiles and revealed a large number of
loci and polymorphism rates > 20% were selected for
genotyping the F1 population. The polymorphism rate
between parents was calculated as the number of bands
present in one parent and absent in the other, relative
to the total number of amplified bands.
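
The polymorphism rate defined above can be sketched in a few
lines of Python. This is only an illustration under stated
assumptions: each parental profile is represented as a set of
band sizes (in bp), the band sizes in the example are invented,
and the function name `polymorphism_rate` is hypothetical.

```python
def polymorphism_rate(parent_a: set, parent_b: set) -> float:
    """Bands present in exactly one parent, relative to all amplified bands."""
    all_bands = parent_a | parent_b    # every band amplified in either parent
    if not all_bands:
        raise ValueError("no bands were scored")
    polymorphic = parent_a ^ parent_b  # present in one parent, absent in the other
    return len(polymorphic) / len(all_bands)


if __name__ == "__main__":
    # Invented band sizes (bp) for one enzyme/primer combination.
    bands_parent_1 = {100, 150, 200, 250, 300}
    bands_parent_2 = {100, 150, 200, 250, 310, 400}
    rate = polymorphism_rate(bands_parent_1, bands_parent_2)
    print(f"polymorphism rate: {rate:.1%}")
    print("selected for genotyping:", rate > 0.20)
```

With these invented profiles, three of the seven bands differ
between parents, so the combination would exceed the 20%
threshold used for AFLPs.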

EST-SSRs amplification 

In analyzing the SUCEST database, Pinto et al. [70] 
identified 2005 clusters containing SSRs. Primer sets 
were subsequently developed from these data, and used 
in polymorphism analyses [70-73]. In addition, Maccher- 
oni et al. [74] analyzed 352 and 122 sugarcane ESTs 
available in both public [75] and private [76] databases 
to establish sugarcane SSRs. Primer sets were developed 
from these sequences. In the present study, we used 
published [72-74], and non-published primer sets devel- 
oped by CanaVialis/Monsanto. 

EST-SSR amplification was performed in a final volume
of 10 µl in 96-well thermocycler plates. Approximately
20 ng of template DNA was mixed in a solution of 0.25 µM
of each forward and reverse primer, 0.2 mM each dNTP,
2.0 mM MgCl2, 1X Colorless GoTaq buffer, and 1.0 U GoTaq
Flexi DNA Polymerase (Promega). Amplifications were
performed using two thermal cycling programs. The first
program was conducted under the following conditions:
94°C for 3 min; 31 cycles of 94°C for 60 s, primer
annealing at varied temperatures for 60 s, and extension
at 72°C for 60 s; and a final elongation at 72°C for
2 min. The second was conducted under the following cycle
parameters: an initial denaturation step at 94°C for
5 min; followed by 35 cycles of 94°C for 30 s, primer
annealing at varied temperatures for 30 s, and extension
at 72°C for 30 s; and a final elongation at 72°C for
60 min. PCR products were analyzed by two methods,
denoted S and F. Amplicons were resolved either by 5%
(w/v) denaturing polyacrylamide gel electrophoresis and
silver staining (S), as described above, or by capillary
electrophoresis (F) using a MegaBACE 1000® genotyping
system (GE Healthcare Life Sciences). For capillary
electrophoresis, forward primers were labeled with
fluorescent dyes (fluorophore NED or 6-FAM, Applied
Biosystems), and fragments were verified with the
Fragment Profiler version 1.2®.

Polymorphisms between parental genotypes were
assessed by amplifying 184 EST-SSRs using DNA from
'IAC66-6', 'TUC71-7', and a sample from the mapping
population (F1). The data included 33 EST-SSRs developed
by Oliveira et al. [72], two from Marconi et al. [73],
and three from Maccheroni et al. [74]. In addition,
146 EST-SSRs were available from CanaVialis/Monsanto
(unpublished data). Results showed 41 polymorphic loci
that segregated in the progeny sample; therefore, these
loci were used to genotype the mapping population.
Details on these sugarcane EST-SSRs are presented in
Table 4.

Marker generation based on sugarcane retrotransposon 
sequences 

The principle of the NBS-profiling technique [77] was
applied according to Hanai et al. [78] to generate markers
based on two retrotransposons named sclvana_1 (GenBank
Accession Number JN800016) and scAle_1 (GenBank Accession
Number JN800006). Approximately 500 ng of genomic DNA were
digested with AluI, DraI, SspI, or RsaI (NE BioLabs).
Digestions were performed in a final volume of 30 µl using
7.5 U of enzyme for 7 h at 37°C, according to the
manufacturer's recommendations. Reactions were terminated
by heat inactivation (20 min at 65°C). Adapters were
prepared by incubating equimolar amounts of LA (long arm)
and SA (short arm) oligonucleotides at 65°C for 10 min,
and then cooling to 37°C and 25°C (10 min each). The SA
oligonucleotide 3' end was blocked for Taq DNA polymerase
extension by the addition of an amino group, but
phosphorylated at the 5' end, which results in an adapter
primer-annealing site only following the first PCR cycle.
Subsequently, the digested material and a solution
containing a 1.6 µM adapter (when restricted with AluI or
RsaI), or a 0.2 µM adapter (when DraI or SspI was used),
1X ligation buffer (NE BioLabs), and 67 U T4 DNA ligase
(400 units/µl; NE BioLabs) were mixed in equal volumes
(30 µl), totaling 60 µl. Ligation was performed at 16°C
for 16 h, and terminated by heat inactivation at 65°C for
20 min. The ligation products (diluted to 5 ng/µl) were
used as template DNA to amplify selected fragments
anchored to the retrotransposon sequence. A 20-µl
reaction mixture contained 4 µl of ligation products,
300 nM of each primer (a primer complementary to the
adapter, and a primer complementary to the retrotransposon
sequence, Table 5, Figure 2), 1.5 mM MgCl2, 0.2 mM each
dNTP, 1X enzyme buffer, and 2 U Taq DNA polymerase
(Fermentas). PCR was conducted under the following
conditions: an initial denaturation step at 94°C for
5 min; followed by 8 cycles at 94°C for 45 s, 58°C
(-1°C per cycle) for 50 s, and 72°C for 1 h:15 min;
25 cycles at 94°C for 45 s, 50°C for 50 s, and 72°C for
1 min; and a final extension at 72°C for 10 min. After
PCR, samples were prepared as described for the AFLP
gels, followed by electrophoresis and gel staining.
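
The adapter design described above relies on the short arm (SA)
annealing to the 3' end of the long arm (LA), leaving the rest
of the LA single-stranded. The Python sketch below checks this
for the SA and LA sequences listed in Table 5; the function
names are illustrative, and the check only concerns base
complementarity, not the chemistry of the blocked 3' end.

```python
# Map each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Adapter oligonucleotides as listed in Table 5 (5'->3'); the amino
# group blocking the 3' end of the short arm is omitted here.
LONG_ARM = "ACTCGATTCTCAACCCGAAAGTATAGATCCCA"
SHORT_ARM = "TGGGATCTATACTT"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def anneals_to_3prime_end(long_arm: str, short_arm: str) -> bool:
    """True if the short arm is complementary to the 3' end of the long arm."""
    return long_arm.endswith(reverse_complement(short_arm))


def single_stranded_part(long_arm: str, short_arm: str) -> str:
    """Portion of the long arm left unpaired after the short arm anneals."""
    return long_arm[: len(long_arm) - len(short_arm)]


if __name__ == "__main__":
    print(anneals_to_3prime_end(LONG_ARM, SHORT_ARM))
    print(single_stranded_part(LONG_ARM, SHORT_ARM))
```

Because the SA cannot be extended, the strand complementary to
the unpaired part of the LA is synthesized only when extension
starts from the retrotransposon-specific primer, which is why
the adapter primer site appears after the first PCR cycle.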

Sixteen restriction enzyme and retrotransposon-
complementary primer combinations (for each sugarcane
retrotransposon) were examined using the DNA from
both parents in duplicate reactions. Combinations that
exhibited clear band distribution over the gels and
revealed polymorphism rates > 15% were selected for
genotyping the F1 population.

Marker nomenclature, genotyping and segregation 
analyses 

The nomenclature adopted for AFLP primers followed
the Keygene standard primer list [80], followed by the
corresponding molecular size (in bp) of the band. A
primer code was adopted for EST-SSR loci (ESTA, ESTB,
ESTC and CV) followed by the molecular size of the allele
(in bp). The retrotransposon-based locus nomenclature was
an abbreviation that indicated the enzyme used in the
digestion reaction (AI for AluI, DI for DraI, SI for
SspI, and RI for RsaI) followed by the primer code
(Table 5), and the molecular size of the fragment (in bp).
Wu's [31] locus-segregation pattern notation precedes
each marker abbreviation, e.g., D1-E35M47510 (Figure 1).
We assumed the presence of a band, denoted by A, to be
dominant over all 0 or null alleles (a simplex
configuration with eight homo(eo)logous copies in the
sugarcane genome), independent of marker type. Loci were
denoted as "D1" when 'IAC66-6' was heterozygous for
band presence, and the other parent was homozygous
for band absence (A0000000 x 00000000). Loci were
denoted as "D2" when 'TUC71-7' was heterozygous and
'IAC66-6' was homozygous. These loci were expected to
segregate in a 1:1 fashion in the F1 population. "C" loci
were heterozygous in both parents (A0000000 x
A0000000), and exhibited a 3:1 segregation ratio.
Differences between observed and expected proportions
were compared using the Chi-square test, assuming a
polyploid model based on single-dose markers (SDM) for
analyzing segregation in outcrossing species [14]. To
minimize problems caused by multiple comparisons, the
Bonferroni correction was performed. Chi-square tests
and Bonferroni adjustments for the effects of multiple
Table 4 Details on the sugarcane microsatellite loci derived from expressed sequence tags (ESTs)

Marker code  Repeat motif              Forward primer (5'→3')     Reverse primer (5'→3')    PCR(e)  AT(f) (°C)  D(g)  Allele no.(h)  Size range (bp)(h)
ESTA26(a)    (CTG)n                    GGCAGCCCCACATCTTCCT        GGGCACAAGCATCCGAACC       1       56.0        S     4              172-186
ESTA48(a)    (CA)8                     AGCAACTCCGGCCTGTCCTG       CTTTCTGTTTTGCTCCTCCGTCTG  1       62.7        S     10             233-295
ESTA53(a)    (TG)8                     TGGAAATGGCAGCTGGTGTCGT     ATGCACGTACCAGAGGGAGATTTG  1       58.9        S     9              168-192
ESTA61(a)    (AT)12                    ACCTGAGTCTCCTCCTCAACC      TATACTACACATGCACAGGCTACG  1       56.4        S     5              236-246
ESTB14(a)    (CCDs                     TGAGGGAATGAATGGACTGG       CCACCACCACCATACCTGTC      1       52.0        S     9              285-315
ESTB55(a)    (CCA)5                    CTTCTTGGCCTTGGCGTTACTGA    GCTAGCTGGCCCCATTTCCTCT    1       60.0        S     3              118-124
ESTB60(a)    (TTG)10                   AGCCGCAATGAATCCAACTG       CTCTAGCTCCGACGATGATACGTG  1       61.0        S     8              157-206
ESTB82(a)    (CGT)9                    CGTCGATCGAGATGAAGAAGG      GAAGCAGTCGTGGAAGTGGAG     1       62.7        S     5              245-263
ESTB94(a)    (CTT)9                    GAGGCAGCCAGGCAGGTCAG       GGTGGCAGTGTTCAGGCAGATG    1       61.0        S     10             210-279
ESTB99(a)    (TCG)5                    GAGGTCCTTCTTGTAGTTGTATGC   GTGCCGGAGGATTTGATG        1       64.7        S     4              215-224
ESTB100(a)   (TCG)6                    CCACGGGCGAGGACGAGTA        GGGTGCTTCTTCGGCTCGTG      1       64.7        S     13             240-278
ESTB118(a)   (TTC)6                    GTTGGGTAGGGTTTCTTGAGTCGT   CATGGCTTTTGGGTTGCTTCT     1       61.0        S     5              106-163
ESTB189(b)   (TCA)10                   GTAAGGAAGAAGCAACAAACAACAG  GATTCGATGCAACTCTCCTGTAAA  1       60.0        S     5              261-280
ESTB191(b)   (GCT)5                    GCGCCATCAGGGAAGCCAAAAC     GCGCGTGCGAGCAGATGAAC      1       60.0        S     5              213-226
ESTC80(a)    (ATTC)3                   ATTCTTTCTCCCCGTGTTGTGC     GTCGCCAGATCGCTTTCGTT      1       58.9        S     7              188-292
CV06(c)      (AATT)13                  TCTCAAGCTTCGCCAGCTA        TGGCTCGGCTGTAGGAATTA      2       60.0        S     3              188-230
CV11(c)      (GAA)6                    TGGCATGTGTCATAGCCAAT       CCCCAACTGGGACTrTTACA      2       60.0        S     6              227-242
CV22(c)      (AGGQ5                    CACTACTCGCCCCGATTTC        CGAGTGCTTCTCCATCTGC       2       64.0        F     8              140-166
CV23(c)      (GGAA)//(AGG)6            GAACTGCTCACTGGCTCGTC       GTAGAAGTCCGTCGCCGTAA      2       64.0        F     9              150-206
CV24(c)      (CCAA)5/(CACCT)4          TCGGAGAAGTTGACCGAGTT       GGTTTAGAGTTGGGGCCTTC      2       60.0        F     7              187-205
CV29(d)      (ATCT)14                  TCGCGTCCACCAATGTAACC       GCGTGCATCGGTTGTGTCTT      2       64.0        F     10             85-133
CV37(d)      (TTTC)15                  GGATGGACGACGTGTCCTGG       ATAAAGTGGCCGCTTGGATTGA    2       64.0        F     6              117-155
CV38(d)      (CTTTT)15                 GAAGCAGGGGCCTCAAGTTG       GTCAAACAGGCGATCTGGCTC     2       64.0        F     9              109-199
CV46(c)      (GGTAA)n                  TGTTCCAAGTTCATGCGCTCC      ATGCATGCAGGTTCAAAAGCAG    2       64.0        F     5              146-188
CV51(c)      (TGT)13                   CTACCCCAACTTGCTTGGGAC      GACTGGAACAAAGACGGACTG     2       64.0        F     3              147-160
CV53(c)      (AAAAT)5/(TTTAT)6         CCCCACCGTAGCTTGTGCAT       AAACGTGCACATGCTTGTATGC    2       64.0        F     7              160-183
CV58(c)      (ATAGAT)10                CGGGTAGTTAGGAGGAGATGG      GTCATCCATTTTGGAACGAATGG   2       64.0        F     6              153-195
CV78(c)      (CTGTG)9                  ACGAGGCCACCATAGAACATG      GCAATTGGGAGGAGAGGAATG     2       64.0        F     9              144-203
CV79(c)      (CTATAT)11/(TATAGA)6      GGCACTGCTGGTGGTTGATTG      TCCCACATCAAGAGGCAGCTA     2       64.0        F     7              136-197
CV86(c)      (AATT)5                   CCTCAGCAGCCCAAAGTCCT       GTCGGAATCAGCCGGATTAGC     2       64.0        F     5              159-187
CV91(c)      (GCC)6/(GCA)6             AAAGGAAATCGCCGTCCGTCT      CCGATGATGAGCCAGCAATCC     2       64.0        F     8              175-197
CV94(c)      (AAAAAG)5/(CGT)5          GGCAGGCCAAGATGAATGAAG      AGCACAGCGGAGGGTACGG       2       64.0        F     4              187-205
CV100(c)     (GAG)13                   CTGTTGAGGAGCCGGATGAG       CTCTTCCGATGGCTCGGTCT      2       64.0        F     9              222-256
CV101(c)     (ATC)23                   GTCGTGGTCGTCACGATCATC      AGTTGACGGCATGGTTCTTGC     2       64.0        F     11             111-180
CV104(c)     (TCCTG)5                  GATnTCGACTGTGCGCTTGG       AAGTTCTCTGCCGGAGCAAAC     2       64.0        F     6              133-158
CV106(c)     (GGC)8                    AAACAGAGCATACTCGAGGCC      ACGTTGCTGACGAGGTTTTCC     2       64.0        F     6              146-161
CV115(c)     (TCACAG)10/(GTA)6/(AGA)5  GTCCATGTCCATCCATGATCG      GGAGCTCCGTCTTCTTGTTAC     2       60.0        S     6              150-174
CV119(c)     (AAAAC)/                  TATCTCTCCTTGGTTTGGATGG     CACCCTACCAAATACCACAACA    2       64.0        F     5              121-175
CV128(c)     (GCA)13                   AGGGCAACGGAGTCTTCGAC       CTGAACTCCGATGTGCTGGTG     2       60.0        F     5              147-168
CV135(c)     (AAG)16                   AGCAAAACCAGCCTTCCCTTC      CTGTTTGTTTCTGCTTGCTTGC    2       64.0        F     6              129-159
CV144(c)     (TCTCCG)5                 GCGCCTCCGTGGATAAGAATC      CCTTCCCCTACAGCGCCTAC      2       64.0        F     5              146-164

(a), (b), (c), (d) Developed by Oliveira et al. [72], Marconi et al. [73], CanaVialis (unpublished), and Maccheroni et al. [74], respectively.
(e) PCR program as described in Material and Methods.
(f) Annealing temperature in the amplification reaction.
(g) D: Silver-stained polyacrylamide gel electrophoresis (S) or fluorescence-based automated capillary electrophoresis (F) for the detection of EST-SSR alleles.
(h) Observed number of alleles per locus and their size ranges in bp.

Table 5 Primer sequences used for the generation of
sugarcane retrotransposon-based markers

Primer                           Sequence 5'→3' (a)
Short arm oligonucleotide (SA)   TGGGATCTATACTT-H2N
Long arm oligonucleotide (LA)    ACTCGATTCTCAACCCGAAAGTATAGATCCCA
Adapter primer (AP)              AGTGGATTCTCAACGCGAAAG
sclvana_1_SSAP1 (sIvSI)          CAAGCCCTTAATAGGAGAAA
sclvana_1_GagRev (sIvGR)         TCCCTGTATACAACGCTGTC
sclvana_1_LTR1 (sIvLI)           AGTCCTGCTCCCAGTTATCA
sclvana_1_LTR2 (sIvLII)          GTCGCCTGGGTGTGTTATC
scAle_1_LTRr (sAlLr)             ATACATGGGCGAGATGGG
scAle_1_RT (sAlRT)               CCTCCCDTCCTCGACCTTC
scAle_1_LTR1 (sAlLI)             GCATGWGRGTAGGCGCATGTGGC
scAle_1_LTR2 (sAlLII)            GGGGTGTTGGAGTGTGATTG

(a) D = A, G or T; R = A or G; W = A or T.

comparisons were performed using the software R, v. 
2.13.0 [81]. Loci with segregation distortion were not 
included in linkage analysis. 
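
The segregation test described in this section can be sketched
as follows. This is an illustrative Python version, not the R
code used in the study: the function names, the example counts
and the overall significance level of 0.05 are assumptions. For
one degree of freedom, the Chi-square tail probability equals
erfc(sqrt(X2/2)), which avoids any external dependency.

```python
import math


def chi_square_segregation(n_present: int, n_absent: int, ratio=(1, 1)):
    """Chi-square test (1 df) of band presence:absence against an expected ratio.

    ratio is (1, 1) for D1/D2 loci and (3, 1) for C loci.
    Returns (statistic, p_value).
    """
    total = n_present + n_absent
    if total <= 0:
        raise ValueError("no individuals scored")
    expected_present = total * ratio[0] / (ratio[0] + ratio[1])
    expected_absent = total - expected_present
    statistic = ((n_present - expected_present) ** 2 / expected_present
                 + (n_absent - expected_absent) ** 2 / expected_absent)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


def fits_after_bonferroni(p_value: float, n_tests: int, alpha: float = 0.05) -> bool:
    """True if the locus shows no significant distortion at alpha / n_tests."""
    return p_value >= alpha / n_tests


if __name__ == "__main__":
    # Invented counts for a D1 locus scored in 188 individuals.
    stat, p = chi_square_segregation(120, 68, ratio=(1, 1))
    print(f"X2 = {stat:.2f}, p = {p:.2e}, kept: {fits_after_bonferroni(p, 10)}")
```

Loci failing the adjusted threshold would be treated as
distorted and left out of the linkage analysis, as stated above.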

Linkage analyses, map construction, and identification of 
sugarcane homo(eo)logous groups 

All linkage analyses were performed using OneMap soft- 
ware [82]. Version 2.0-1 was preferred to construct a 
multipoint maximum likelihood linkage map applying a 
Hidden Markov Model approach [83]. 

First, co-segregation groups (CGs) were established
using a LOD score > 6.0 and a recombination fraction <
0.35. For groups with six or fewer markers, the best order
was obtained by comparing all possible orders and choosing
the one with the highest likelihood, using the algorithm
implemented in the command named "compare". To obtain the
best order for larger groups (more than six markers), the
command "order.seq" was applied. In this case, the
likelihood was the criterion used to place the markers
along the CGs under a multipoint approach, as validated by
Mollinari et al. [84]. Additionally, the "ripple" command
was used to check for alternative orders, as well as a
visual inspection of the matrix containing the pairwise
recombination fractions and LOD scores (heatmaps) for the
CGs. The commands "compare", "order.seq" and "ripple" are
similar to those in the MAPMAKER/EXP software [17].
Finally, multipoint estimates of recombination fractions
were calculated and converted into linkage distances using
the Kosambi map function [85]. Map drawings were generated
using MapChart 2.2 [86].
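
For reference, the Kosambi map function converts a recombination
fraction r into a map distance d = 25 ln[(1 + 2r)/(1 - 2r)] cM.
The short Python sketch below implements this conversion and its
inverse; it is an illustration only (the study used OneMap), and
the function names are hypothetical.

```python
import math


def kosambi_cm(r: float) -> float:
    """Kosambi map distance (cM) for a recombination fraction 0 <= r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_inverse(d_cm: float) -> float:
    """Recombination fraction corresponding to a Kosambi distance in cM."""
    return 0.5 * math.tanh(d_cm / 50.0)


if __name__ == "__main__":
    for r in (0.01, 0.10, 0.35):
        print(f"r = {r:.2f} -> {kosambi_cm(r):.2f} cM")
```

For small r the distance in cM is close to 100r, while values of
r near the grouping threshold of 0.35 correspond to distances
well above 35 cM.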

Due to their multiallelic nature and known
polymorphisms, EST-SSR loci are valuable in recognizing
homo(eo)logous groups (HGs) in sugarcane. Initially, CGs
were assigned to HGs if they contained at least two of the
same EST-SSRs. In addition, CGs were putatively added
if they contained an EST-SSR locus in common with the
HG [28,30,32,54]. Using this practice, two HGs were
established, I and IV. Then, due to the insufficient
number of SSRs, in a number of instances only one locus
was used to assign CGs to the following putative HGs:
II, III, V, VI and VII [30,32]. We applied Roman numerals
to denote HGs; within each HG, CGs were ranked in
descending order according to size (in cM). The
unassigned groups were designated U, and also ranked
according to their size.

Finally, to test whether sclvana_1-based markers have a
tendency to be clustered along the genome, we used an
approach similar to the one presented by Echt et al. [87].
The genetic map was divided into 10 cM bins and the
number of sclvana_1-based markers in each interval was
recorded. If markers were randomly distributed, they
would follow a Poisson distribution [47], defined as
$P(x) = \lambda^{x} e^{-\lambda} / x!$, where $P(x)$ is
the probability function, $x$ is the number of markers
observed in the intervals (ranging from 0 to 3), and
$\lambda$ is the distribution parameter, calculated as the
average number of markers per interval in the map. A
Chi-square goodness-of-fit test was performed, with 2



sclvana_1 (Gag, Pol): LTR 5' - PBS - CP - PR - INT - RT - PPT - LTR 3'
scAle_1 (Gag, Pol): LTR 5' - PBS - CP - PR - INT - RT - RNASE H - PPT - LTR 3'

Figure 2 Structure of sugarcane retrotransposons sclvana_1 and scAle_1. Retrotransposons are LTR (long terminal repeats) consisting of
elements within transcription initiation and termination sequences and detected as Gag, Pol, and Int domains that code for CP (capsid-like
proteins), PR (protease), RT (reverse transcriptase), RNAase-H (ribonuclease H), and INT (integrase). Other sequences featured are PBS (primer
binding sites), and PPT (polypurine tracts). Arrows indicate the primers designed for amplifying each of the elements, and synthesis direction. 1:
sclvana_1-SSAP1; 2: sclvana_1-GagRev; 3: sclvana_1-LTR1; 4: sclvana_1-LTR2; 5: scAle_1-LTRr; 6: scAle_1-RT; 7: scAle_1-LTR1; 8: scAle_1-LTR2. Figures
were not drawn to scale and were adapted from Kumar and Bennetzen [79].
degrees of freedom ($df = c - 1 - r$, where $c$ is the
number of classes and $r$ is the number of estimated
parameters, $r = 1$).
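
The clustering test outlined above can be sketched in Python as
follows. This is an illustration under stated assumptions, not
the authors' script: the bin counts in the example are invented,
the last class is treated as "3 or more" when computing its
expected frequency, and the closed-form p-value exp(-X2/2) is
valid only for 2 degrees of freedom.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """P(x) = lam**x * exp(-lam) / x!"""
    return lam ** x * math.exp(-lam) / math.factorial(x)


def clustering_test(observed):
    """Chi-square goodness of fit of bin counts to a Poisson distribution.

    observed[k] is the number of 10 cM bins containing k markers
    (k = 0, 1, 2, 3). Returns (lam, statistic, df, p_value).
    """
    n_bins = sum(observed)
    n_markers = sum(k * count for k, count in enumerate(observed))
    lam = n_markers / n_bins                 # estimated from the data (r = 1)
    probs = [poisson_pmf(k, lam) for k in range(len(observed) - 1)]
    probs.append(1.0 - sum(probs))           # pooled upper tail for the last class
    expected = [p * n_bins for p in probs]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - 1               # df = c - 1 - r
    if df != 2:
        raise ValueError("closed-form p-value below assumes df = 2")
    p_value = math.exp(-statistic / 2.0)     # chi-square survival function, 2 df
    return lam, statistic, df, p_value


if __name__ == "__main__":
    # Invented counts: bins with 0, 1, 2 and 3 markers (56 markers, 485 bins).
    lam, stat, df, p = clustering_test([440, 36, 7, 2])
    print(f"lambda = {lam:.3f}, X2 = {stat:.2f}, df = {df}, p = {p:.3g}")
```

An excess of bins holding two or three markers relative to the
Poisson expectation inflates the statistic, which is the
signature of clustering that the test is meant to detect.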

Molecular validation of sclvana_1-based fragments

Verification that amplified DNA fragments were derived
from retrotransposon templates was conducted by excising
DNA fragments from polyacrylamide gels, eluting DNA in a
TE solution (10:1), and re-amplifying the DNA fragments.
Five µl of the diluted DNA mixture was added to 50 µl of
the same solution used for retrotransposon-based marker
generation, except that the primer concentration was
30 nM and 5 U of Taq DNA polymerase were used. The PCR
program was simplified and conducted under the following
conditions: an initial denaturation step at 94°C for
5 min; 30 cycles of 94°C for 30 s, 55°C for 30 s, and
72°C for 42 s; and a final extension at 72°C for 10 min.
For sequencing, PCR fragments were resolved on agarose
gels, purified with the QIAEX II Gel Extraction kit
(QIAGEN), and cloned using the pMOS Blue Blunt-Ended
Cloning Kit (GE Healthcare Life Sciences). Inserts were
sequenced in the forward direction. Sequencing reactions
were performed according to Sanger et al. [88] using the
DYEnamic ET Dye Terminator Cycle Sequencing Kit (Amersham
Pharmacia Biotech, Inc.) on an ABI 3730 system (Applied
Biosystems). Sequence quality was examined using the
Phred/Phrap/Consed package [89]. Nucleotide sequences were
compared to reference data available at GenBank by BLAST
analysis [90].

Additional files 



Additional file 1: Amplification patterns obtained from AFLP,
EST-SSR, and sclvana_1-based markers for the sugarcane mapping
population. Several segregating alleles are shown, as well as molecular
weight standard (lane M) fragment sizes. Codes correspond to parental
and F1 progeny genotypes.

Additional file 2: Genetic information provided by the AFLP,
EST-SSR and the retrotransposon sclvana_1-based loci.